National Repository of Grey Literature
Algorithms for anomaly detection in data from clinical trials and health registries
Bondarenko, Maxim ; Blaha, Milan (referee) ; Schwarz, Daniel (advisor)
This master's thesis deals with anomaly detection in data from clinical trials and medical registries. Its purpose is to review the literature on data quality in clinical trials and to design a custom algorithm, based on machine learning methods, for detecting anomalous records in real clinical data from ongoing or completed clinical trials or medical registries. The practical part describes the implemented detection algorithm, which consists of several parts: import of data from the information system, preprocessing and transformation of the imported records, whose variables have different data types, into numerical vectors, application of well-known statistical methods for outlier detection, and evaluation of the quality and accuracy of the algorithm. The output of the algorithm is a vector of parameters containing anomalies, which is intended to make the data manager's work easier. The algorithm is designed to extend the functions of the CLADE-IS information system with automatic monitoring of data quality through the detection of anomalous records.
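The pipeline outlined in this abstract (encode mixed-type records as numerical vectors, then apply a statistical outlier test) can be sketched as follows. This is only an illustration under stated assumptions, not the thesis's algorithm: the abstract does not name the encoding or the test, so one-hot encoding and the median/MAD modified z-score are assumed here, and the function names, field names and the 3.5 threshold are hypothetical.

```python
from statistics import median

def encode_records(records, categorical_keys):
    """One-hot encode categorical fields; numeric fields become floats.

    Assumption: every record is a dict with the same keys.
    """
    # Sorted category levels per categorical field give a stable column order.
    levels = {k: sorted({r[k] for r in records}) for k in categorical_keys}
    numeric_keys = sorted(k for k in records[0] if k not in categorical_keys)
    vectors = []
    for r in records:
        vec = [float(r[k]) for k in numeric_keys]
        for k in categorical_keys:
            vec.extend(1.0 if r[k] == level else 0.0 for level in levels[k])
        vectors.append(vec)
    return vectors

def modified_z_scores(values):
    """Modified z-score based on the median and the median absolute deviation."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # No spread around the median: nothing can be scored in this column.
        return [0.0 for _ in values]
    return [0.6745 * (v - med) / mad for v in values]

def flag_anomalies(vectors, threshold=3.5):
    """Indices of records with at least one coordinate beyond the threshold."""
    flagged = set()
    for j in range(len(vectors[0])):
        column = [v[j] for v in vectors]
        for i, z in enumerate(modified_z_scores(column)):
            if abs(z) > threshold:
                flagged.add(i)
    return sorted(flagged)

# Hypothetical records: the last age looks like a data-entry error.
records = [
    {"age": 34, "sex": "F"}, {"age": 36, "sex": "M"}, {"age": 35, "sex": "F"},
    {"age": 37, "sex": "M"}, {"age": 33, "sex": "F"}, {"age": 350, "sex": "M"},
]
vectors = encode_records(records, categorical_keys=["sex"])
anomalies = flag_anomalies(vectors)
```

A per-column test like this only catches values that are extreme in a single variable; a multivariate criterion would be needed for records that are unusual only in combination.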
Boxplot for multivariate data
Brabenec, Tomáš ; Nagy, Stanislav (advisor) ; Hlávka, Zdeněk (referee)
We introduce three extensions of the classical Tukey boxplot to multivariate data: the Rangefinder, the Relplot and the Bagplot. Implementing these methods requires notions such as the Mahalanobis distance, elliptically symmetric distributions and halfspace depth. A large part of the thesis focuses on the construction of the Relplot and the Bagplot. We also discuss how these methods detect outliers and what their advantages and disadvantages are. The work contains many examples and illustrative figures.
Geometric approach to the estimation of scatter
Bodík, Juraj ; Nagy, Stanislav (advisor) ; Antoch, Jaromír (referee)
In this thesis we describe improved methods for estimating the mean and scatter of multivariate data. The sample mean and the sample variance matrix are non-robust estimators, which means that even a small number of measurement errors can seriously affect the resulting estimate. This problem can be addressed with the MCD (minimum covariance determinant) estimator, which computes the sample variance matrix from a subset of the data only, namely the subset for which the determinant of this matrix is smallest. This estimator is also very helpful in outlier detection, which is used in many applications. Moreover, we introduce the MVE (minimum volume ellipsoid) estimator. We discuss some properties of these two estimators and compare them.
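The MCD idea described above can be illustrated with a brute-force sketch for two-dimensional data: try every subset of size h, keep the one whose sample variance matrix has the smallest determinant, and use its mean and variance matrix to compute robust Mahalanobis distances. This is an assumption-laden toy, not the thesis's method: exhaustive search is feasible only for very small samples, no consistency or reweighting corrections are applied, and all function names and data points are hypothetical.

```python
from itertools import combinations

def mean_and_cov_2d(points):
    """Sample mean and sample variance matrix (sxx, sxy, syy) of 2-D points."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    return (mx, my), (sxx, sxy, syy)

def det_2d(cov):
    sxx, sxy, syy = cov
    return sxx * syy - sxy * sxy

def mcd_2d(points, h):
    """Exhaustive MCD: the h-subset with the smallest covariance determinant."""
    best = min(combinations(points, h),
               key=lambda subset: det_2d(mean_and_cov_2d(subset)[1]))
    return mean_and_cov_2d(best)

def mahalanobis_sq_2d(p, center, cov):
    """Squared Mahalanobis distance, using the closed-form 2x2 inverse."""
    sxx, sxy, syy = cov
    dx, dy = p[0] - center[0], p[1] - center[1]
    return (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det_2d(cov)

# Hypothetical data: six points near the origin and one gross error.
points = [(0.0, 0.0), (1.0, 0.2), (0.1, 1.0), (1.2, 1.1),
          (0.5, 0.4), (0.6, 0.7), (10.0, -8.0)]
center, cov = mcd_2d(points, h=5)
robust_d2 = [mahalanobis_sq_2d(p, center, cov) for p in points]
```

Because the selected subset excludes the gross error, the last point receives a far larger robust distance than the rest, whereas distances based on the ordinary sample mean and variance matrix would be pulled toward it.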
Evolutionary Algorithms for Data Transformation
Švec, Ondřej ; Pilát, Martin (advisor) ; Neruda, Roman (referee)
In this work, we propose a novel method for supervised dimensionality reduction, which learns the weights of a neural network using an evolutionary algorithm, CMA-ES, optimising the success rate of the k-NN classifier. If no activation functions are used in the neural network, the algorithm essentially performs a linear transformation, which can also be used inside the Mahalanobis distance. Therefore our method can be considered a metric learning algorithm. By adding activations to the neural network, the algorithm can learn non-linear transformations as well. We consider reductions to low-dimensional spaces, which are useful for data visualisation, and demonstrate that the resulting projections provide better performance than other dimensionality reduction techniques and also that the visualisations provide better distinctions between the classes in the data thanks to the locality of the k-NN classifier.
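The link between a linear transformation and the Mahalanobis distance mentioned in this abstract can be checked numerically: the Euclidean distance between Lx and Ly equals a Mahalanobis-type distance between x and y with matrix M = LᵀL. The sketch below only demonstrates this identity; the matrix L and the vectors are made-up values, the helper names are hypothetical, and no CMA-ES or k-NN step is included.

```python
import math

def matvec(L, x):
    """Multiply matrix L (list of rows) by vector x."""
    return [sum(l_ij * x_j for l_ij, x_j in zip(row, x)) for row in L]

def transformed_distance(L, x, y):
    """Euclidean distance between the projections Lx and Ly."""
    lx, ly = matvec(L, x), matvec(L, y)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(lx, ly)))

def gram(L):
    """M = L^T L, a d x d positive semidefinite matrix."""
    d = len(L[0])
    return [[sum(row[i] * row[j] for row in L) for j in range(d)]
            for i in range(d)]

def mahalanobis_form(M, x, y):
    """sqrt((x - y)^T M (x - y)) for a given matrix M."""
    diff = [a - b for a, b in zip(x, y)]
    d = len(diff)
    return math.sqrt(sum(diff[i] * M[i][j] * diff[j]
                         for i in range(d) for j in range(d)))

# Hypothetical 2 x 3 projection (3-D input reduced to 2-D) and two points.
L = [[1.0, 0.5, -1.0],
     [0.0, 2.0, 0.25]]
x, y = [1.0, 2.0, 3.0], [0.0, -1.0, 4.0]

d_projected = transformed_distance(L, x, y)
d_metric = mahalanobis_form(gram(L), x, y)
```

When L has fewer rows than columns, as here, M is singular, so the induced distance is a pseudometric: distinct points that project to the same low-dimensional location have distance zero.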
Discriminant and cluster analysis as a tool for classification of objects
Rynešová, Pavlína ; Löster, Tomáš (advisor) ; Řezanková, Hana (referee)
Cluster analysis and discriminant analysis are basic classification methods. Cluster analysis organizes a disordered group of objects into several internally homogeneous classes, or clusters. Discriminant analysis derives a classification rule from the known membership of objects in existing classes; this rule can then be used to classify units whose group membership is unknown. The aim of this thesis is to compare discriminant analysis with different methods of cluster analysis. Distances between objects are measured with the squared Euclidean and Mahalanobis distances. In total, 28 datasets are analyzed. When correlated variables were left in the dataset and the squared Euclidean distance was applied, Ward's method classified objects into clusters most successfully (42.0%). After switching the metric to the Mahalanobis distance, the furthest neighbor method became the most successful (37.5%). After removing highly correlated variables and applying methods with the Euclidean metric, Ward's method was again the most successful at classifying objects (42.0%). The results imply that cluster analysis is more precise when correlated variables are excluded than when they are left in the dataset. The average success rate of discriminant analysis, for data both with and without correlated variables, is 88.7%.
